Data Mining and Model Simplicity: A Case Study in Diagnosis
نویسندگان
چکیده
We describe the results of performing data mining on a challenging medical diagnosis domain, acute abdominal pain. This domain is well known to be difficult, yielding little more than 60% predictive accuracy for most human and machine diagnosticians. Moreover, many researchers argue that one of the simplest approaches, the naive Bayesian classifier, is optimal. By comparing the performance of the naive Bayesian classifier to its more general cousin, the Bayesian network classifter, and to selective Bayesian classifiers with just 10% of the total attributes, we show that the simplest models perform at least as well as the more complex models. We argue that simple models like the selective naive Bayesian classifier will perform as well as more complicated models for similarly complex domains with relatively small data sets, thereby calling into question the extra expense necessary to induce more complex models.
منابع مشابه
Comparison of Three Decision-Making Models in Differentiating Five Types of Heart Disease: A Case Study in Ghaem Sub-Specialty Hospital
Introduction: cardiovascular diseases are becoming the main cause of mortality and morbidity in most countries. This research goal was to predict the types of heart diseases for more accurate diagnosis by data mining and neural network technics. Method: This research was an applied-survey study and after data preprocessing, three approaches of neural network, decision making tree and Bayes simp...
متن کاملComparison of Three Decision-Making Models in Differentiating Five Types of Heart Disease: A Case Study in Ghaem Sub-Specialty Hospital
Introduction: cardiovascular diseases are becoming the main cause of mortality and morbidity in most countries. This research goal was to predict the types of heart diseases for more accurate diagnosis by data mining and neural network technics. Method: This research was an applied-survey study and after data preprocessing, three approaches of neural network, decision making tree and Bayes simp...
متن کاملA Case Study of the Impact of Parental Diseases on the Probability of Hypertension Using Data Mining Techniques
Introduction: Hypertension is one of the most common health problems. As it has a major impact on other serious diseases such as cardiovascular diseases and strokes, and due to not having any specific symptoms, it is known as a silent killer. Therefore, proper diagnosis, control, and treatment of hypertension is crucial in health care systems and will indeed prevent the development of the other...
متن کاملA Case Study of the Impact of Parental Diseases on the Probability of Hypertension Using Data Mining Techniques
Introduction: Hypertension is one of the most common health problems. As it has a major impact on other serious diseases such as cardiovascular diseases and strokes, and due to not having any specific symptoms, it is known as a silent killer. Therefore, proper diagnosis, control, and treatment of hypertension is crucial in health care systems and will indeed prevent the development of the other...
متن کاملApproximate resistivity and susceptibility mapping from airborne electromagnetic and magnetic data, a case study for a geologically plausible porphyry copper unit in Iran
This paper describes the application of approximate methods to invert airborne magnetic data as well as helicopter-borne frequency domain electromagnetic data in order to retrieve a joint model of magnetic susceptibility and electrical resistivity. The study area located in Semnan province of Iran consists of an arc-shaped porphyry andesite covered by sedimentary units which may have potential ...
متن کاملA Probabilistic Bayesian Classifier Approach for Breast Cancer Diagnosis and Prognosis
Basically, medical diagnosis problems are the most effective component of treatment policies. Recently, significant advances have been formed in medical diagnosis fields using data mining techniques. Data mining or Knowledge Discovery is searching large databases to discover patterns and evaluate the probability of next occurrences. In this paper, Bayesian Classifier is used as a Non-linear dat...
متن کامل